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change of the coupling strength between two chaotic systems can be captured by 
the proposed approach. It is also discussed how the kernel parameter is suitably 
chosen from data. 
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1. Introduction 

Synchronization of chaotic systems has been an active research area in recent years ^ . 
Now the notion of synchronization of chaotic systems is extended far beyond complete 
synchronization between identical systems |21 El and various kinds of nonlinear 
synchronization have been proposed A significant extension is generalized 

synchronization (GS) which is defined by a time- independent nonlinear functional 
relation y = ^'(x) between the states x and y of two systems X and Y. 

Experimental detection and characterization of GS from observed data is a 
challenging problem, especially in biology; e.g., for study on nonlinear interdependence 
observed in binding of different features in cognitive process and epilepsies in the 
brain 011]. 

In unidirectionally coupled systems, a way to detect GS is to make an identical 
copy Y' of the response system Y driven by the common signal from the driver system 
X, then investigate whether orbits of both Y and Y' coincide after transient P llOj. 
However, it is very difficult to prepare an identical copy of the response system, and 
this approach does not give any information on the structure of the synchronization 
manifold M — {{x, y) : y = '^{x)} in the state space. 

On the other hand, several indices have been proposed 13 El El El E] , to 
quantify nonlinear dependence between X and Y based on a local relation between 
them. While the usefulness of these indices were shown in some examples [3 El E] > 
those approaches are still ad- hoc, lacking systematic methods for unifying the local 
relation of different regions in the state space. 

Recently, kernel methods have been attracting much attention in the machine 
learning community as powerful tools for analyzing data with high nonlinearity 
|14l 1151 [T?)l I17j . In this paper, we propose a novel approach for analyzing GS based 
on Kernel Canonical Correlation Analysis (Kernel CCA) ^1 El EHI 1^ which is 
a version of the kernel methods [22] J- We will demonstrate that the Kernel CCA 
provides a suitable index for measuring the nonlinear interdependence, and gives global 
coordinates characterizing the synchronization manifold Ai of GS. Note that the latter 
has not been addressed in conventional studies on the analysis of GS. The proposed 
method does not require explicit knowledge on the underlying equations behind time 
series. Although we restrict our attention to the analysis of numerical experiments 
here, it is also applicable to experimental data. 

The present paper is organized as follows. In section 2, we introduce a formulation 
of the Kernel CCA. Sections 3 and 4 provide several successful applications of the 
Kernel CCA for treating GS. In section 5, it is demonstrated that the nonstationary 
change of the coupling strength between two chaotic systems can be captured by the 
proposed approach. In section 6, we discuss an issue on the choice of kernel parameters. 
Our conclusion is given in section 7. 

2. Kernel Canonical Correlation Analysis 

Canonical Correlation Analysis (CCA) is a multivariate analysis method to find a 
pair of vectors that maximize the correlation coefhcient between projections of signals 
onto them (2^1 ■ CCA is useful for detecting a linear relation between a pair of 
multidimensional data sets, but cannot deal with the case like GS where there is 

t Preliminary results are reported in a conference proceedings |22l with a simpler example of detection 
of GS. Here, we give comprehensive treatments with examples of practical interests 
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a high nonhnear relation between two data sets. Extending the ability of CCA for 
analyzing data with nonlinearity, kernelization of CCA has been proposed by several 
researchers ^1^1 1201 121- An intuitive formulation of the Kernel CCA is as follows. 

For a pair of multivariate variables (x,y) with x G and y G R"*, the Kernel 
CCA seeks a pair of nonlinear scalar functions / : ^ M and g : ^ K. such that 
an estimator of the correlation coefficient 
py, = cov(/(g;),ff(7/)) 

v/var(/(a;))^var(.g(y)) 

between f{x) and g{y) is maximized. 

Methods based on the expression of the generalized correlation coefficient have 
a long history |24II25| . However the use of the kernel approach gives a notable progress 
in the subject, with a remarkably simple and flexible way of treating (Q. 

The essence of the kernel methods is an assumption that nonlinear functions / 
and g are well approximated by linear combinations of kernels on data points (a;„, y„), 

N N 

f{x)^^ank{xn,x), g{y) = '^(3nk{yn,y), (2) 

n—l n—1 

where {(x„, yn)}n=i denotes a training data set of (x, y). 

Examples of kernels are k{x,x') — e 2^ (Gaussian), {{x ■ x') + 
0)'^) (Polynomial), and tanh(?7(a;-a;')+6') (Sigmoidal) |15lll6l[T7) . A rich representation 
of nonlinear functions is allowed by suitably chosen weight coefficients {an}^^Lij 
{Pn]n=i s-iid kernel parameters. At first sight, the dependence of the expressions for / 
and g on the given data {(a;„, yn)^n=i brings some ad-hoc nature to the method. The 
assumption is, however, equivalent to the assumption that / and g are minimum norm 
functions in a Reproducing Kernel Hilbert Space (RKHS). It is proved by invoking 
representer theorem in the theory of RKHS ^1 ^1 E| • 

Substituting ^ into (Q), and replacing covariance cov(-, •) and variances var(-) 
with empirical averages over the data set {{xn,yn)}n=n ^^^^ Kernel CCA reduces to 
the maximization of 

where a = *{ai, a2, un), P = *(/?!, /32, /3jv), and Kx,Ky are the Gram matrices 
{Kx)i.j — k{xi, Xj) and {KY)i,j = k{yi, yj) defined with the given sample. The Gram 
matrix contains relevant topological information of data points in the state space. For 
simplicity, we tentatively assume that the averages of {f{xn)}n=i aiid are 
zero. Later we will show that subtraction of the averages from data can be done in 
an implicit way and does not cause any technical difficulty. 

A naive generalization of CCA leads to the maximization of ^aKxKyP subject 
to *aK\a — ^fiKyfi — 1. The corresponding Lagrangian is 

£0 = 'aKxKvP - ^CaKla - 1) - ^(*/3if|^/3 - 1), (4) 

where px and py are Lagrange multipliers. Taking derivatives of Cq with respect to 
a and /3, and px — Py = P from the constraint, the Kernel CCA is formulated as the 
following generalized eigenvalue problem: 



Detecting Generalized Synchronization by A Kernel-based Approach 



4 



There is, however, a difficuhy when the Gram matrices Kx and Ky are invertible. 



By multiplying both sides of jSJ by {K K , Ky^ Ky^) from the left hand side, we 



have 

a = ^K^'Kyfi, (6) 

P = -Ky'Kxa. (7) 
P 

Substituting |71l into 0, we obtain 

la = p^a, (8) 

which gives a trivial solution p = ±1. Such a situation is not uncommon, because 
the Gram matrix is invertible when a Gaussian kernel is used and data points are 
distinct each other. Thus the naive kernelization ||SJ) of CCA does not provide useful 
information on the nonlinear dependence. 

To overcome this problem we introduce small regularization terms in the 
denominator of the right hand side of Q as 

- (9) 
^^aK-j,a + n\\fr^^^(3K'i^(3 + K\\gr/ 

where and WgW^ are quadratic norms of / and g on the RKHS defined as 

N N N 

11/11^ = X! ""'/(^"O ^ X! y^,Cln'ank{Xn,Xn') 
n' — l n' — l n— 1 

= *aKxa, (10) 

N N N 

n' — l n' — l n— 1 

= *PKyP, (11) 

and K is its parameter. Note that although the parameter k ^ is required for 
a nontrivial result, the precise value of k is not important§ This insensitivity to 
the value of k will be confirmed by a numerical experiment in the next section. 
As well as (jSj), the maximization of the numerator ^aKxKyf^ in subject to 
^aKj^a + n'^aKxa = ^PKyp + k*/3Ky(3 = 1 reduces to the following generalized 
eigenvalue problem 

KxKy \ f a 
KyKx ){ P 

( Kx{Kx + Kl) n9^ 

= P\ Ky{Ky + nI))[p)- (12) 

The first eigenvalue of (|12|l gives the maximal value p^^^ of pjr in © . p^'^^ is called the 
canonical correlation coefficient, and the variables u — f{x) and v — g{y) transformed 
by / and g are called the canonical variates of the Kernel CCA. 

§ Recently, statistical consistency of the Kernel CCA, i.e., convergence of the estimators of / 
and g to the optimal ones is proved when / and g belong to a RKHS, under the condition that 
k/N ~ Ar-i/3+'5(o < 5 < 1/3) with Af ^ oo. Fukumizu K, Bach F R, and Gretton A 2005 
Consistency of kernel canonical correlation analysis, ISM Research Memorandum No. 942 
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So far, the averages of {f{xn)}n=i and {g{yn)}n=i zero. In fact, it is 

not necessary to subtract the averages exphcitly, because the subtraction of the means 
is equivalent to the replacement of the Gram matrices K with 

K = K-^0 %K lif (j *j) + ^(j (13) 

where j is a A'^-dimensional vector such that each component equals to the unity jl5l 

nniini. 

In the examples of this paper, the data {xn}n=i ^^'^ {yn}n=i a-re normalized 
within an unit interval [0, 1] before applying the Kernel CCA. 

3. Detecting Generalized Synchronization by Kernel CCA 

The kernel methods such as the Kernel CCA have been mainly applied to problems 
of pattern recognition J7] and bioinformatics (27] ■ In this paper, we propose that the 
Kernel CCA can also be a powerful tool for analyzing nonlinear dynamics. 

Voss et al. 1261 proposed a method for analyzing dynamical system by the 
maximization the expression It is based on the Alternating Conditional 

Expectation (ACE) algorithm where the transformations / and g in are 

restricted to a linear combination of univariate functions such as f{xi, X2, ■ ■ ■ , Xp) = 
^i{xi) ^ while the Kernel CCA allows any nonlinear function f{xi, X2, ■ ■ ■ , Xp) of p 
variables in a RKHS. This difference makes the proposed method much more powerful 
than the one based on ACE, especially in the cases where nonlinear dependence 
between variables xi,X2, ... is essential. Unlike ACE, the Kernel CCA does not need 
any iterative procedure and it is enough to solve a generalized eigenvalue problem by 
standard numerical algebra, only once for each set of data and parameters. 

In this section, we study two unidirectionally coupled Henon maps ^I] as an il- 
lustrative example and demonstrate the ability of the proposed approach. 



3.1. Quantitative Characterization of Generalized Synchronization 

Two unidirectionally coupled Henon maps are described by the following difference 
equations: 

r a;i(< + l) = 1.4-xi(t)2 + 0.3x2(t) , . 

■ 1 X2{t+\)=X,{t), ^'^> 

. r yi{t + 1) = 1.4 - {7xi(i)2/i(0 + (1 - 7)2/1 W'} + 0.12/2(0 
^ ■ \ y2(i+l) = yi(0, ^'^^ 

where x{t) = {xi{t)^X2{t)) and y{t) = (yi{t),y2{t)) are state variables at time t, and 
7 denotes the coupling strength between two systems X and Y . 

Figures n (a) and (b) show the projections of the strange attractors of (jl4(l and 
p5|l with 7 = 0.0 and 0.25 onto the {xi,yi) plane, respectively. When the coupling 
strength 7 is large, a complicated driver-response relation with high nonlincarity is 
formed between two different chaotic dynamics. It can be identified in figure ^ (b) 
with a visual inspection, but it is not easy to represent it by naive statistical tools 
such as correlation coefficients. 

We apply the Kernel CCA to the cases shown in figures ^ (a) and (b). Here, we 
employ a Gaussian kernel with a = 0.1 and n is set to 0.3. We prepare an orbit with 
length N = 2 X 10'^ as training data. As already remarked, each of variables xi, X2, 
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Figure 1. (a, b) Projections of strange attractors of the coupled Henon maps 
onto the {xi,yi) plane with 7 = 0.0 in (a) and 7 = 0.25 in (b). For both of (a) 
and (b), an orbit with length 5 X 10* is used for plotting, (c, d) Scatter plots of 
the canonical variates of the Kernel CCA with 7 = 0.0 in (c) and 7 = 0.25 in (d). 



2/1, and ?/2 is normalized within the unit interval [0, 1]. Figures ^(c) and (d) illustrate 
results of the Kernel CCA. Figuren(d) shows that GS is clearly identified as a cloud 
of points along the diagonal on the plane of the canonical variates u and v. On the 
other hand, when X and Y are independent, the correlation between the canonical 
variates is very weak as shown in figure ^ (c). Figure [21 (a) shows the dependence 
of the canonical correlation coefficient p'^^^ on the coupling strength 7. The rapid 
increase of p'^'^^ against 7 in an interval 0<7<0.17is indicated. 

We will show that the Kernel CCA can be used with time-delay embedding scheme 
for time series. In time-delay embedding scheme, a pair of c?-dimensional vectors 
{xi{t),xi{t - I), ...,xi{t - (rf - 1)0 and {yi{t),yi{t ~ I), ...,yi{t - (d - 1)0, where I 
denotes the delay time, is used as a sample of training data. Figure [5] (b) shows the 
graph of pji^^ vs. 7 for two different values of the embedding dimension d. Shapes of 
the graphs shown in figure[21(b) does not change significantly compared to those shown 
in figure El (a). This result indicates that the Kernel CCA works for data obtained by 
using the time-delay embedding scheme. 

3.2. Comparison with A Conventional Method 

In EH, an identification of GS based on the occurrence of the complete 
synchronization between the response system and its identical copy is proposed. For 
the coupled Henon maps H14|l and H15(l . we investigate the synchronization errors 
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Figure 2. (a) The canonical correlation coefficient of the Kernel CCA as 

a function of the coupling strength 7 for three different values of a with k = 0.1 
and N = 2 X 10^. Inset An enlargement of (a), (b) The index p^'^ vs. 7 
for the embedding dimension d = 4 and d = 6 with cr = 1.0, At = 0.1, and 
AT = 2 X 10^. The delay time I for embedding is set to 1. (c) The time average of 
the synchronization error between the response system and its identical copy as 
a function of 7. (d) The maximal Lyapunov exponent \r of the response system 
as a function of 7. 
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{\\y ~ y'^°^^\\'^)t between the system Y and its identical copy Y' with variables y'^°py = 
l^j^copy^ ^copy-j^ The rcsult is shown in figure |21(c). Here, the index {\\y — y'^°^^\\'^)t 
between Y and Y' is measured by the average of 10^ time steps and plotted as a 
function of 7. The index (||y — y'^°''''|p}t decreases with increasing 7 and becomes zero 
at 7 ~ 0.17. Our results shown in figures |2] (a) and (b) are consistent with this result. 

We also investigate the maximal Lyapunov exponent of the state y{t) = 
y'^°py(^t). The result is shown in figure El (d). The index is determined from 
the first eigenvalue of the product of the Jacobian matrix of 1)1 5|l with respect to the 
variables y , and an orbit with length 10^ is used for its numerical evaluation. There is 
an interval where Xr changes nonmonotonically against 7. Such nonmonotonic change 
is also observed in the graph of pj^'^^ vs. 7 as shown in the inset of figure|21(a). These 
results tell us that pj^'^^ defined by the Kernel CCA can characterize subtle change as 
well as global tendency of GS. 

3. 3. Influence of Noise and Sample Size 

We consider how the performance of the Kernel CCA is influenced by the introduction 
of the observational noise and the change of the size of training data. First we consider 
the influence of noise. In numerical simulation, for each variable of (|14|l and (|15|l 
normalized within the interval [0,1], Gaussian random numbers with the mean zero 
and the standard deviation g are added as the observational noise. Results are shown 
in figure 13 For a low noise level {g = 0.01), the graph of p™'*'^ vs. 7 is almost same 
as that for the noise free case. For higher noise levels {g = 0.05), there is a moderate 
decrease of py^'^^ in the whole interval of 7, however, the global tendency of the graph 
of p™'*'^ vs. 7 is not lost. The proposed approach is fairly robust against noise except 
for extremely high noise level {g = 0.3). 

Second, we study the infiuence of the size of training data. Figure 0] (a) shows 
the average of p'^^^ over 20 realizations as a function of 7 for several different values 
of the size N. With the exception of 7 = 0, the graph of pj^^^ vs. 7 does not depend 
much on N. The result shows that the Kernel CCA works even with relatively small 
size of training data. For the cases of 7 = and 7 = 0.25, the graphs of p'^^^ vs. N 
are shown in figure ^(b). Here, the vertical bars denote the corresponding standard 
deviation. The average of p'^'^^ for 7 = increases monotonically with decreasing N 
whereas the one for 7 = 0.25 does not almost change against N. The result with 7 = 
suggests that a proper choice of the kernel parameter for a given data is important 
for obtaining correct results. This issue is discussed in Section 6. 

3.4- Assessing Sensitivity of Kernel CCA to Nonlinear Structure: Surrogate Data 
Analysis 

In order to investigate the ability of the Kernel CCA to nonlinear dependence between 
two systems (|T^ and (fT^ , we use a method of surrogation|2H|- Multivariate surrogate 
data are generated as follows: first, the Fourier transform of the time series is 
calculated for each of variables, then the common random numbers are added to 
the phase variables, and finally the inverse Fourier transform is applied. The resulting 
multivariate time series have the same power spectra and cross spectra as those of the 
original time series. By changing random numbers added to the phases, an arbitrary 
number of different time series which preserve the linear properties of the original is 
obtained. See papers [291 130L 1^ for technical details. 
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In numerical simulation, 19 realizations of the surrogate data for the time series 
{(xi(t),yi(i))} of HU and (dHJ are prepared by using the TISEAN package [SI EHI- 
We take d = 4 and ^ = 1 for time-delay embedding. 

In figure |S1 (a) , the index p'^^^ defined by the Kernel CCA for the original data 
and that for the surrogate data as functions of 7 are shown. For the surrogate data, 
the average over 19 realizations is plotted as a function of 7, and the corresponding 
maximal and minimal values are also shown as the both edges of vertical bars. For 
both of the original and the surrogate data, the index p'^^^ increases with increasing 
7. Except for 7 = and 7 ~ 1, however, the value of p'^^^ for the original data is 
significantly higher than that for the surrogate data. For larger values of 7, p'j^'^^ for 
the surrogate data increases monotonically, and p^^^^ 1 as 7 — » 1. This coincides 
with the fact that the attractor of the systems p4|l and 1)15(1 is located around the 
plane Xi = yi and X2 = 2/2 and the relation between two systems becomes almost 
linear one. The results suggest that the Kernel CCA is sensitive to nonlinearity of the 
dependence between two systems. 

We also investigate the performance of the linear CCA in the same way and results 
are shown in figure (b) . In this case, the difference between the maximal canonical 
correlation coefhcients p"^^^ of the linear CCA for the original data and the one for 
the surrogate data is not significant for any value of 7. This indicates that the linear 
CCA can detect only the linear dependence between two systems. 
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Figure 4. (a) Dependence of the index p^''" on 7 for three different values of 
tfie size A'' of training data witfi a = 0.4, k = 0.1. (b) Dependence of the index 
^rnax jy f^j. ^ — Q ^nd 7 = 0.25 with CT = 0.4, K = 0.1. The average of p^^^ 
over 20 realizations are plotted as symbols in (a), and in addition to the average, 
the standard deviation is also plotted as vertical bars in (b). 
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Figure 5. The canonical correlation coefBcients p^i^'" of the Kernel CCA (a) and 
^max q£ ^-Jji^ lijiQar CCA (b) as functions of 7 for the original data and the surrogate 
data. For the surrogate data, the average, and the maximal and minival values of 
^max Qygj. ig realizations are plotted as symbols, and both edges of vertical bars, 
respectively. Parameters are d = 4, Z = 1,ct = 1.0, k = 0.1, and N = 10^ in (a) 
and d = 4,/ = 1 and AT = lO^ in (b). 
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3.5. The Regularization Parameter k 

As mentioned in the preceding section, the regularization terms are required for 
nontrivial results. Figure shows the dependence of p™^'' on the regularization 
parameter k for three different values of a. Although p™^'^ 1 for too small k, 
the value of p™'*'^ decreases gradually and does not depend on the precise value of k. 



sigma=0.1 - 

sigma=0.2 

sigma=0.4 



Figure 6. The index p'^" 
three different values of a. 



as a function of the regularization parameter k for 
7 = 0.3 and AT = 3 X 10^. 



4. Other Examples 

In order to illustrate the capability of the Kernel CCA in more complicated situations, 
we add the following three examples. 



4.1. Coupled Rossler-Lorenz Systems 

First, we consider GS in a Lorenz system driven by a Rossler svstem[TU): 



X 



Y 



= xi+ 0.2x2 (16) 
= 0.2 + 2:3 (xi -5.7), 

= 16(y2 - yi) - 7(yi - xi) 

= -ym + 45.92yi-y2 (17) 

= vm - 4:y3- 

The coupling term is introduced in the first equation of (|17|l . and 7 is its strength. 

We confirm that there is a sharp transition of GS at 7 ^ 4.8 by investigating 
the long time average of the synchronization error between the system H17(l and its 
identical copy driven by the common signal xi(t) of the system ()16|l. There coexist 
two attractors in the state space after the transition of GS and we choose one of them. 
Figures [71 (a) - (c) show the projections of the strange attractor of the system (|m|l 
and H17|l with 7 = 10 onto the planes of {xi,yi), (x2,2/2), and {xs^ys) respectively. 



Detecting Generalized Synchronization by A Kernel-based Approach 



13 



For each of pairs shown m figures[7|(a) - (c), the time series is embedded as points 
in a d-dimensional state space, and we apply the Kernel CCA. We set the embedding 
dimension d = 6 and the delay time I = 2.5. The size of training data is A'' = 10'^. 
Results are shown in figure |H1 (a). All indices p'^'^'^ shown in figure |S1 (a) take large 
values for 7 > 4.8. The value 7 ^ 4.8 agrees with the transition point of GS. In figure|Hl 
(b), results of the linear CCA applied to the same data are also shown. The indices 
^max q£ linear CCA for the cases of {xi vs. yi) and {x2 vs. 1/2) take large values 
after the GS transition as well as the indices p^^^ obtained by the Kernel CCA. For 
the case of (0:3 vs. 1/3), however, p™3,x ^j^g ijjjg^p CCA is smaller than p'^'^^ of the 
Kernel CCA. This result suggests that the Kernel CCA outperforms the linear CCA 
when the relation between two observed time series has high nonlinearity. 










-HB 












1 










-i:d 







Figure 7. Projections of tlie strange attractor of tlie coupled Rossler and Lorenz 
systems witli 7 = 10 onto the planes of {xi,yi) in (a), (x2,y2) in (b), and (xsjj/a) 
in (c). 



4.2. Neural Spike Trains Modulated by Chaotic Inputs 

Second, we analyze the following FitzHugh-Nagumo (FHN) neuron model modulated 
by chaotic dynamics of the Rossler model: 

= -t(x2 + 0:3) 

X : { X2 = t{xi + QMX2) (18) 
= t(0.4 + X3(xi -4.5)), 

y., ^r ={-2/i(yi-0.5)(2/i-l)-?y2 + 5(t)}/0.02 
' \ y2 = 2/1 - y2 - 0.15, 

where the term 

S'(i) = 0.23 + 0.0075a;i(t) (20) 

defines a chaotic inputs to the neuron, and r is a parameter that controls the dominant 
time scale of the Rossler dynamics. The system composed of (|18|) and H19() has been 
investigated from the viewpoint of the problem whether the information on the input 
signal can be decoded from the output interspike intervals (ISIs) generated by a neuron 
or not |32[ 1^ I34j . Here, we focus on the relation between the input chaotic stimulus 
and the output ISIs from the viewpoint of GS between two oscillators with different 
dynamics. 

We define the «-th ISI as Si — ti — ti^i where ti is onset time of the i-th spike 
defined as the time when the variable yi makes upward crossing over some fixed 
threshold yg. The value of ye is set to 0.7 here. We also define the chaotic stimulus 
associated with the i-th ISI {ti), which is the value of Xi in H18|l at ti. 

By using the delay embedding scheme, we transform the time series {ri} and {si} 
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Figure 8. (a) The canonical correlation coefficient p'^^^ of the Kernel CCA 
as a function of the coupling strength 7 for different pairs of variables with 
A'' = 10'^, d = 6,1 = 2.5, a = 1.5 and k = 0.1. (b) The canonical correlation 
coefficient p™^"' of the linear CCA as a function of the coupling strength 7 for 
different pairs of variables with N = 10'^, d = 6, and I = 2.5. The results for 
the case where the original state variables {xi,X2,X3) and (2/1,1/2,2/3) are used as 
training data are also shown. 
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into the state points {(r^, r^.i, r^.d^+i)} and {(s^, Sj_i, Si_<jr+i)} in and 
dy-dimensional state spaces, respectively. We set dx = dy = 3 here and apply the 
Kernel CCA to these data sets. 

In figureslHl (al)-(a3), significant nonlinear dependences between the input chaos 
and the output ISI are observed in scatter plots of {ri,Si). Figures El (bl)-(b3) also 
show scatter plots of the canonical variates of the Kernel CCA associated with figuresEl 
(al)-(a3). It is easy to see that the correlation between canonical variates u and v 
shown in figures El (bl)-(b3) correspond to the complexity of nonlinear dependence 
between and Xi shown in figures El (al)-(a3). 

The relation between and Si changes according to the value of the control 
parameter r. In figure El (c), the canonical correlation coefficient p'^^^ is plotted as a 
function of r, which visualizes the change of the input-output relation between two 
systems of ()18|l and ()19|l . The index p™^'^ changes nonmonotonically with the increase 
of r, and there is a regime around r ~ 4 where the value of py^'^^ is large. In addition 
to GS, chaotic phase synchronization (CPS) |3S1 occurs between two systems in this 
regime 13^. The increase of pj^'^^ in this regime can be attributed to the occurrence 
of CPS. 

4-3. Bidirectionally Coupled Systems 

The notion of GS is not resricted to unidirectinally coupled systems. In [3^1 137) . 
the occurrence of GS for bidirectionally coupled systems is also discussed. As an 
example of GS in bidirectinally coupled systems, we consider the following coupled 
Lorenz-Rossler systems [36| : 

{xi ^ 10{x2 - xi) + ^{yi - xi) 
X2 — 35xi — X2 — X1X3 (21) 
X3 ^ -{8/3)x3 + X1X2, 

{yi ^5.5y2-y3 + 7{xi-yi) 
m = 5.52;i + 0.165y2 (22) 
2/3 =0.2 + 2/3(2/1-10). 

A mutual interaction between a Lorenz system (|21|l and a Rossler system (|22f) is 
introduced as diffusion terms in the first equations of both H21|l and (|22|l , and 7 is its 
strength. In [3^1, Zheng et al. try to define GS in bidirectionally coupled systems by 
considering identical copies X' and Y' of the systems X and Y . Using the approach 
in 126 , two transitions of GS are found in the systems 121|l and H22|l . At 7 ~ 1.2, 
the Rossler system Y is entrained by the Lorenz system X in the sense that orbits of 
the systems Y and Y' completely coincide with each other by receiving the common 
signal of the system X. With the further increase of 7, the system X is also entrained 
by the system K at 7 ~ 12.3 which means that the complete synchronization between 
X and X' also occurs. 

We apply the Kernel CCA to the systems ()21|l and (|22|l and results are shown in 
figures ITUl and ITTI Figure ITUI (all, (bl), and (cl) show the projections of attractors 
of the systems (|21|l and (|22|) with 7 = 0.4, 4.0, and 13.0 onto the {xi^yi) planes, 
respectively. Figures ITUI fa2). (b2) and (c2) also show the corresponding results of the 
Kernel CCA. Here, we use a Gaussian kernel with a = 1.0. The state variables 
{{xi{ti),X2{ti),X3[ti)),{yi{ti),y2{ti),y3{ti))} where ti = iAt.i = l,...,iV,Ai = 
5.0, N = 2 X 10^ are used as the training data set. It is observed that the state 
of GS is clearly identified as the high linear correlation between canonical variates 
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Figure 9. (al - a3) Scatter plots of and Si of I18i and 1191 with r = 2.7 in 
(al), T = 4.0 in {a2), and t = 7.5 in (a3). (bl - b3) Scatter plots of canonical 
variates u and v of the Kernel CCA with r = 2.7 in (bl), t = 4.0 in (b2), and 
T = 7.5 in (bS). cr = 0.5, k = 0.1, and N = 2 X 10^. (c) The canonical correlation 
coefficient p^Ji^" of the Kernel CCA as a function of t with a = 0.5, k = 0.1, and 
Af = 2 X 10^ 
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Figure 10. (al, bl, cl) Projections of attractors of the bidirectionally coupled 
Lorcnz and Rossler systems onto the plane of (xi, j/i) with 7 = 0.4 in (al), 7 = 4.0 
in (bl) 7 = 13.0 in (cl). (a2, b2, c2) Scatter plots of the canonical covariates of 
the Kernel CCA with 7 = 0.4 in (al), 7 = 4.0 in (bl) 7 = 13.0 in (cl). Parameters 
are cr = 1.0, k = 0.1, and Af = 2 X 10^. 

u and V of the Kernel CCA. Application of the Kernel CCA to the bidirectionally 
coupled systems is straightforward while the approach of [SHI requires rather subtle 
procedures. Figure ^2 shows the index p'^^^ as a function of the coupling strength 7. 
There is a rapid increase of p™'^'^ against 7 in an interval < 7 < 1.2. This result 
is consistent with the first transition of GS defined in The second transntion of 
GS at 7 ~ 12.3 defined in is not seen in the graph of p™'*'^ vs. 7. A reason is 
that the value of p'^'^^ already becomes nearly one at 7 ~ 5. Another reason is that 
by definition, the proposed approach based on the Kernel CCA is insensitive to the 
directionality of synchronization. It will be interesting to study modifications of the 
approach which can deal with the directionality of synchronization. 

5. Nonstationary Change of Coupling Strength 

So far, we have focused on the dependence of the first eigenvalue p'^^^ on the coupling 
strength 7. We turn our attention to the eigenvector *(a,/3) in ifT^ and investigate 
changes of the structure of the dynamical system with the Kernel CCA. 

As an illustration, let us consider the problem of extracting nonstationary changes 
of the coupling strength 7 from time series data generated by the coupled Henon maps. 
An example of such changes is shown in figure El (b). When the value of 7 varies 
from 7o, an orbit leaves from the synchronization manifold M. with 70, and is attracted 
again to it when the value of 7 is restored to 70. As shown in figure El (a), the time 
series of the original variables does not tell us whether the orbit is lying on 

A4 or not at a given time t. By using the Kernel CCA, a deviation of the orbit from 
A4 will be detected as a large value of \ f{x) ~ g{y)\, where /(•) and g{-) are nonhnear 
transformations determined by the first eigenvector *(a,/3). 

Numerical experiments are performed with 70 = 0.6 in two different conditions, 
and results are shown in figures El(c) and (d), respectively. Figure El (c) shows the 




case where /(■) and g{-) are estimated from an orbit with 7 = 70, which is prepared 
separately from the orbit to be analyzed. In figure [T^ (c). time intervals where the 
values of 7 are different from 70 is clearly detected as successive bursts in the time 
series of \f{x) - g{y)\. 

In figure ^1 (d), we show the case where the orbit to be analyzed is also 
used for estimating /(•) and g{-). We see that even when estimation of nonlinear 
transformations is affected by "noise" from the points not lying on M, successive 
bursts are still observed in the time series of \f{x) — g{y) \ except for the one in a time 
interval 1600 < t < 1800. 

6. Choice of Kernel Parameters 

In the preceding section, we set the value of the kernel parameter such as the width a of 
a Gaussian kernel in an ad-hoc manner. If a is too small, the nonlinear transformations 
/(•) and (?(•) in Q cannot interpolate between data points of the training sample. 
Contrary, if a is too large, (0) cannot represent a highly nonlinear structure such as 
the synchronization manifold Ai of GS. Thus a proper choice of the kernel parameter 
is crucial to obtain the good performance of the method. In this section, we discuss 
how the kernel parameter can be suitably chosen from a given data set. 

A naive way of choosing a is to find the value of a that maximizes p^^'^. The 
dotted line with open circles in figurc^|shows the graph of an index p'^'^^ as a function 
of a when two Henon maps l|14() and (|15l) are uncoupled (7 = 0). Here, an orbit of 
p4|l and H15(l with length N = 10^ is used as a set of training data, and the average 
and the standard deviation over 20 realizations are plotted as symbols and vertical 
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Figure 12. (a) Time series of the variables xi and yi of the coupled Henon maps 
when the value of 7 changes temporally as shown in (b). (c, d) Time series of the 
difference of the canonical variates of the Kernel CCA with a = 0.1 and k = 0.1. 



bars, respectively. The index p'^^^ increases monotonically with the decrease of cr, and 
^max ^ ig attained in the limit of ct ^ 0, while there is no interaction between two 
systems X and Y. This monotonic tendency of p'^'^^ does not determine an optimal 
value of the kernel parameter a. 

As a way to overcome this difficulty, the following procedure is proposed. First, 
we set aside the data {{xn,yn)}n=i for assessing the performance of the Kernel CCA 
(we call this "test" data) separately from the data used for training the Kernel CCA. 
Then, we estimate the nonlinear transformations /(■) and g{-) from the training data, 
and calculate the correlation coefficient p'^ between the variables u„ = f{xn) and 
Vn = giVn), n = 1,2, N defined as 

CV ^ ((ttn - {iin)) ■ jVn - (Vn))) 

^{{Un-{Un)?-^{{in-{nn)f' 

where (•••) is the empirical average over N samples. This strategy for assessing the 
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performance of the estimated model with new data is regarded as a version of cross- 
validation (cv) [SHI Ellin]. 

First we check that spurious detection of synchronization can be avoided by the 
procedure based on CV. The sohd hne with filled squares in figure El shows the 
dependence of an index p^y on a with 7 = 0. When there is no interaction between 
two systems, the value of is nearly equal to zero for any a. Corresponding results 
using time-delay embedding are shown in figure IT^ In figure ITTI the dependencies of 
p'^^^ and p^y on a with 7 = are plotted for several different values of the embedding 
dimension d. Again, the value of p^y is almost equal to zero for any a and d, whereas 
the value of p'^'^^ as a function of a increases monotonically with the increase of d. This 
result suggests that the procedure based on CV effectively erases spurious detection 
even when data is embedded in a high-dimensional state space. 
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Figure 13. Result of cross-validation for two uncoupled Henon maps (7 = 0.0). 
The dotted line with open circles shows p^'^" vs. rr, and the solid line with filled 
squares shows p^y vs tr. We prepare 20 data sets with size N = 10^ for training 
the Kernel CCA, and a test data with size TV = 2 X 10^ for cross-validation. The 
average over 20 realizations are plotted as symbols. The standard deviation is 
also plotted as vertical bars, but the ones with small values of errors are hidden 
by symbols, k is set to 0.01. 



Next we show that the cross validation procedure is useful for choosing optimal 
values of a. Figures ^1 shows the values of p'^'^^ and as functions of a for the 
coupled Henon maps (|14|l and (|15|l with 7 = 0.25. The condition for the numerical 
experiment is the same as 7 = 0. For all of graphs, the value of p'^ with the dotted 
line takes its maximum at a nonzero value of a, whereas p™'*'^ increases monotonically 
with the decrease of a. This result indicates that we can choose this value as an 
optimal width of a Gaussian kernel. 
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Figure 14. Results of cross-validation with 7 = for four different values of 
the embedding dimension d. We take the delay time 1 = 1. The condition for 
numerical experiments is the same as figure [T^ 




Figure 15. Results of cross-validation for two coupled Henon maps with 7 = 0.25. 
The embedding dimension d = 2 in (a), d = 4 in (b), d = 6 in (c), and d = 8 
in (d), and the delay time / = 1. The condition for numerical experiments is the 
same as figure [T]^ 
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7. Conclusion 

In conclusion, we have proposed a new approach for analyzing GS in a unified 
framework of a kernel method. We have tested the proposed approach by applying it 
to several examples exhibiting GS, and demonstrated that the canonical correlation 
coefficient of the Kernel CCA is a suitable index for the characterization of GS. In 
addition, it has been shown that nonstationary changes of the coupling are detected 
from the time series by the difference between canonical variates of the Kernel CCA. 
It has been also discussed how the parameter of the kernel function can be suitably 
chosen from data by the procedure of cross-validation. Our experiments show that 
a method based on CV gives promising results in optimizing the parameter a. The 
cross-validation procedure is also useful to circumvent spurious detection of GS by 
overfitting. 

The approach based on the Kernel CCA provides not only an index for measuring 
nonlinear interdependence. It also provides global nonlinear coordinates, and these 
coordinates allow a representation of the interaction between the dynamical systems 
under investigation. Note that the linear CCA also provides global coordinates, but 
it cannot discriminate between linear and nonlinear relation. In this respect, it goes 
beyond conventional methods of analyzing GS. Our attempts open a new possibility 
of the kernel methods for analyzing complex dynamics observed in nonlinear systems. 
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